CELLOBIOHYDROLASE I GENE AND IMPROVED VARIANTS

The United States Government has rights in this invention under contract number
DE-AC36-99GO10337 between the United States Department of Energy and the National
Renewable Energy Laboratory, a division of the Midwest Research Institute.

The present application claims priority to PCT Application PCT/US00/19007, filed July
13, 2000, which is hereby incorporated by reference. PCT/US00/19007 claims priority to U.S.
Provisional Application 60/143,711, filed July 13, 1999.

Field of the Invention.

This invention relates to 1,4-β-cellobiohydrolases or exoglucanases. More specifically, it
relates to the Trichoderma reesei cellobiohydrolase I gene, the creation of reduced glycosylation
variants of the expressed CBH I protein to enable the expression of active enzyme in
heterologous hosts, and the creation of new thermally stable variants of the enzyme that confer
higher thermal tolerance on the protein and improved performance.

Background of the Invention.

The surface chemistry of acid-pretreated biomass used in ethanol production differs
from that of plant tissues naturally digested by fungal cellulase enzymes in two important
ways: (1) pretreatment heats the substrate past the phase-transition temperature of lignin; and (2)
pretreated biomass contains less acetylated hemicellulose. Thus, it is believed that the cellulose
fibers of pretreated biomass are coated with displaced and modified lignin. This alteration results
in non-specific binding of the protein to the biomass, which impedes enzymatic activity.
Moreover, where the pretreated biomass is a hardwood pulp, it carries a weak net negative
surface charge, which is not observed in native wood. Therefore, for the efficient production of
ethanol from a pretreated biomass such as corn stover, wood, or other biomass, it is desirable to
enhance the catalytic activity of glycosyl hydrolases, specifically the cellobiohydrolases.

Trichoderma reesei CBH I (SEQ ID NO: 5) is a mesophilic cellulase that plays a major
role in the hydrolysis of cellulose. An artificial ternary cellulase system consisting of a 90:10:2
mixture of T. reesei CBH I, Acidothermus cellulolyticus EI, and Aspergillus niger β-D-glucosidase
is capable of releasing as much reducing sugar from pretreated yellow poplar as the
native T. reesei system after 120 h. This result is encouraging for the ultimate success of
engineered cellulase systems, because this artificial enzyme system was tested at 50°C, a
temperature far below that considered optimal for EI, in order to spare the more heat-labile
enzymes CBH I and β-D-glucosidase. To increase the efficiency of such artificial enzyme
systems, it is desirable to engineer new T. reesei CBH I variant enzymes capable of active
expression in heterologous hosts. The heterologous host Aspergillus awamori could
provide an excellent capacity for synthesis and secretion of T. reesei CBH I because of its ability
to correctly fold and post-translationally modify proteins of eukaryotic origin. Moreover, A.
awamori is believed to be an excellent test-bed for Trichoderma coding sequences and resolves
some of the problems associated with site-directed mutagenesis and genetic engineering in
Trichoderma.

In consideration of the foregoing, it is therefore desirable to provide variant cellulase
enzymes having enzymatic activity when expressed in a heterologous host, and to provide
variant cellulase enzymes that have improved thermal tolerance over the native enzyme as
produced by Trichoderma reesei.

Summary of the Invention.

It is a general object of the present invention to provide variant cellulase enzymes having
enzymatic activity when expressed in a heterologous host, such as a filamentous fungus or yeast.

Another object of the invention is to provide variant exoglucanases characterized by a
reduction in glycosylation when expressed in a heterologous host.

Another object of the invention is to provide an active cellobiohydrolase enzyme capable
of expression in heterologous fungi, including yeast.

Another object of the invention is to provide variants of the cellobiohydrolase enzyme
with improved thermal tolerance, capable of functioning at increased process temperatures.

It is yet another object of the invention to provide a method for reducing the 
glycosylation of a cellobiohydrolase enzyme for expression in a heterologous host. 

The foregoing specific objects and advantages of the invention are illustrative of those 
which can be achieved by the present invention and are not intended to be exhaustive or limiting 
of the possible advantages which can be realized. Thus, these and other objects and advantages
of the invention will be apparent from the description herein or can be learned from practicing
the invention, both as embodied herein or as modified in view of any variations which may be
apparent to those skilled in the art.

Briefly, the invention provides a method for making an active cellobiohydrolase in a
heterologous host, the method comprising reducing glycosylation of the cellobiohydrolase,
reducing glycosylation further comprising replacing an N-glycosylation site amino acid residue
with a non-glycosyl accepting amino acid residue. The invention further provides a
cellobiohydrolase comprising the reduced glycosylation variant cellobiohydrolase enzymes
CBHI-N45A, CBHI-N270A, or CBHI-N384A, or any combination thereof.

Detailed Description of the Invention.

Unless specifically defined otherwise, all technical or scientific terms used herein have 
the same meaning as commonly understood by one of ordinary skill in the art to which this 
invention belongs. Although any methods and materials similar or equivalent to those described 
herein can be used in the practice or testing of the present invention, the preferred methods and 
materials are now described. 

The terms "native" and "wild-type" are used interchangeably throughout this disclosure to
indicate the origin of the molecule as it occurs in nature. 

A method for reducing the glycosylation of an expressed Trichoderma reesei CBH I
protein by site-directed mutagenesis ("SDM") is disclosed. The method includes replacing an
N-glycosylation site amino acid residue, such as asparagine 45, 270, and/or 384 (referenced herein
as CBHI-N45A, CBHI-N270A, and CBHI-N384A, respectively), with a non-glycosyl accepting
amino acid residue, such as alanine. Various mutagenesis kits for SDM are available to those
skilled in the art, and the methods for SDM are well known. The description below discloses a
procedure for making and using the CBH I variants CBHI-N45A (SEQ ID NO: 6), CBHI-N270A
(SEQ ID NO: 7), and CBHI-N384A (SEQ ID NO: 8). The examples below demonstrate the
expression of active CBH I in the heterologous fungus Aspergillus awamori.
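
As an illustration only, the asparagine-to-alanine replacements named above can be expressed as a short Python sketch. The function name `apply_substitutions` and the toy sequence are hypothetical and are not part of this disclosure; the CBH I sequence itself is not reproduced here, and residue numbering is assumed to be 1-based.

```python
import re


def apply_substitutions(protein, substitutions):
    """Apply point substitutions written like 'N45A' (1-based positions)."""
    residues = list(protein)
    for name in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
        if match is None:
            raise ValueError(f"unrecognized substitution: {name}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{name}: position outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"{name}: expected {old} at {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Hypothetical toy sequence; NOT the CBH I sequence.
toy_protein = "MKNGTAQNLSW"
toy_variant = apply_substitutions(toy_protein, ["N3A", "N8A"])
print(toy_variant)
```

Checking the expected residue before replacing it guards against numbering mismatches, for example between a precursor sequence that still carries its signal peptide and the mature protein.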

Variants of CBH I embodiments include mutations that provide for improved end product 
inhibition and for thermal tolerance.

Brief Description of the Figures.

Figure 1. Coding sequence of the cbh 1 gene (SEQ ID NO: 4). Lower case
letters represent the signal sequence, upper case letters the catalytic domain, bolded italics
the linker region, and upper case underlined the cellulose-binding domain.

Figure 2. SDS-PAGE Western blot with anti-CBH I antibody showing the reduction in
molecular weight of rCBH I expression clones as a function of the introduction of N to A
modifications.

Figure 3. Plasmid map for the fungal expression vector pPFE2/CBH 1.

Figure 4. Nucleotide sequence, SEQ ID NO: 1,
5'-CCTCCCGGCGGAAACCCGCCTGGCACCACCACCACCCGCCGCCCA-3', coding for the
linker region, PPGGNPPGTTTTRRP (SEQ ID NO: 2), of the CBH I protein encoded by the cbh 1
gene (SEQ ID NO: 4), showing additional proline residues that affect the conformation of the
linker region in the protein structure.
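
The relationship between the 45-nucleotide linker coding sequence and the 15-residue linker peptide described for Figure 4 can be checked with a minimal translation sketch. The code below assumes the standard genetic code and that the quoted nucleotide sequence is read in frame from its first base; the names `CODON_TABLE` and `translate` are illustrative only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Linker coding sequence as quoted for Figure 4.
LINKER_DNA = "CCTCCCGGCGGAAACCCGCCTGGCACCACCACCACCCGCCGCCCA"
linker_peptide = translate(LINKER_DNA)
print(linker_peptide, linker_peptide.count("P"))
```

Translating the quoted sequence this way yields a 15-residue peptide containing five proline residues.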

Example 1. Acquisition of the CBH I Encoding Sequence.

Acquisition of the gene was done by either cDNA cloning or by PCR of the gene from
genomic DNA. CBH I cDNA was isolated from a T. reesei strain RUT C-30 cDNA library
constructed using a PCR-generated probe based on published CBH I gene sequences
(Shoemaker, et al., 1983). The cDNAs were cloned (using the Zap Express cDNA kit from
Stratagene; cat. #200403) into the XhoI and EcoRI site(s) of the supplied, pre-cut lambda arms.
An XhoI site was added to the 3' end of the cDNA during cDNA synthesis, and sticky-ended RE
linkers were added to both ends. After XhoI digestion, one end has an XhoI overhang, and the
other (5' end) has an EcoRI overhang. The insert can be removed from this clone as an
approximately 1.7 kb fragment using SalI or SpeI plus XhoI in a double digest. There are two
EcoRI, one BamHI, three SacI, and one HindIII sites in the coding sequence of the cDNA itself.
The plasmid corresponding to this clone was excised in vivo from the original lambda clone, and
corresponds to pB210-5A. Thus, the cDNA is inserted in parallel with a Lac promoter in the
pBK-CMV parent vector. Strain pB210-5A grows on LB + kanamycin (50 µg/mL).

Acquisition of the cbh 1 gene was also achieved by PCR of genomic DNA. With this
approach, the fungal chromosomal DNA from T. reesei strain RUT C-30 was prepared by grinding
the fungal hyphae to a fine powder in liquid nitrogen using a mortar and pestle. The genomic
DNA was then extracted from the cell debris using a Qiagen DNeasy Plant Mini kit.
Amplification of the DNA fragment that encodes the cbh 1 gene, including introns, was
performed using polymerase chain reaction (PCR) with specific primers for the T. reesei cbh 1
gene. The primer 5'-AGAGAGTCTAGACACGGAGCTTACAGGC-3' (SEQ ID NO: 9), which
introduces an Xba I site, and the primer
5'-AAAGAAGCGCGGCCGCGCCTGCACTCTCCAATCGG-3' (SEQ ID NO: 97), which introduces
a unique Not I site, were used to allow cloning into the pPFE Aspergillus/E. coli shuttle
vectors that are described below. The amplified PCR product was then gel purified and cloned
directly into the vectors.
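
As a quick check of the primer description above, the following sketch searches the two primer sequences for the recognition sequences of Xba I (TCTAGA) and Not I (GCGGCCGC). These recognition sequences are standard, well-documented values rather than statements of this disclosure, and the dictionary and function names are illustrative.

```python
# Recognition sequences (both palindromic, so strand orientation does not matter).
RECOGNITION_SITES = {"XbaI": "TCTAGA", "NotI": "GCGGCCGC"}

# Primer sequences as quoted in the text above.
XBA_PRIMER = "AGAGAGTCTAGACACGGAGCTTACAGGC"
NOT_PRIMER = "AAAGAAGCGCGGCCGCGCCTGCACTCTCCAATCGG"


def find_sites(sequence):
    """Return {enzyme: 0-based offset} for each recognition site present."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


xba_primer_sites = find_sites(XBA_PRIMER)
not_primer_sites = find_sites(NOT_PRIMER)
print(xba_primer_sites, not_primer_sites)
```

Each primer carries only its own site, preceded by a short 5' extension of six to eight bases, which is a common design choice because many restriction enzymes cut poorly at the very end of a DNA fragment.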

Example 2. Production of Active Recombinant CBH I (rCBH I) in Aspergillus awamori.
Construction of the Fungal Expression Vectors pPFE-1/CBH I and pPFE-2/CBH I.

The coding sequence for T. reesei CBH I was successfully inserted and expressed in
Aspergillus awamori using the fungal expression vector pPFE2 (and pPFE1). Vectors pPFE1 and
pPFE2 are E. coli-Aspergillus shuttle vectors and contain elements required for maintenance in
both hosts. Both pPFE-1 and pPFE-2 vectors direct the expression of a fusion protein with a
portion of the glucoamylase gene fused to the gene of interest. The pPFE1 vector contains a
region of the glucoamylase gene, with expression under the control of the A. awamori
glucoamylase promoter. The protein of interest is expressed as a fusion protein with the secretion
signal peptide and 498 amino acids of the catalytic domain of the glucoamylase protein. The
majority of the work presented here was done using the pPFE2 expression vector, which was
chosen because of its smaller size, simplifying the PCR mutation strategy by reducing extension
time.

The major features of the pPFE2/CBH1 construct are shown in Figure 3. With both the
pPFE1/CBH1 and the pPFE2/CBH1 vectors, the sequence immediately upstream of the Not I site
encodes a LysArg dipeptide. A host KEX-2 like protease recognizes this dipeptide sequence
during the secretion process, and the fusion peptide is cleaved, removing the glucoamylase
secretion signal peptide or the longer catalytic domain of glucoamylase in the case of pPFE1. In
this way, the recombinant CBH I protein experiences an "efficient ride" through the A. awamori
secretion system and is expressed with the native N-terminal protein sequence. The net result is
that the recombinant CBH I is processed so that it can accumulate in the medium without its
glucoamylase secretion signal fusion partner. The vector contains the Streptoalloteichus
hindustanus phleomycin resistance gene, under the control of the A. niger β-tubulin promoter,
for positive selection of Aspergillus transformants. The pPFE/CBH1 vector also contains a
β-lactamase gene for positive selection using ampicillin in E. coli, and also contains the A. niger
trpC terminator. The insertion of the CBH I coding sequence into the pPFE vectors was
accomplished using two methods. Vector DNA was first produced in 500 mL cultures of E. coli
XL1-Blue and the plasmids purified using Promega maxi-prep DNA purification kits.

Approach 1: Blunt-Xba I fragment generation.

1. Oligonucleotides were designed to give a blunt end on the 5' end and an engineered Xba I
site on the 3' end of the PCR fragment.

2. The full-length coding sequence for CBH I was obtained by PCR using Pfu DNA
polymerase and using the cDNA construct pB510-2a as the template. Pfu DNA
polymerase generates blunt-ended PCR products exclusively.

3. The pPFE vectors were digested using Not I and confirmed by agarose gel
electrophoresis. The Not I overhang was then digested using Mung Bean nuclease. The
DNA was purified and the vector and CBH I PCR fragment digested using Xba I.

4. The vector and PCR product were then ligated using T4 DNA ligase and the DNA used
to transform E. coli XL1-Blue and E. coli DH5α using electroporation.

Approach 2: Not I-Xba I fragment approach.

1. Oligonucleotides were designed to give a Not I site on the 5' end and an engineered Xba
I site on the 3' end of the PCR fragment.

2. The full-length coding sequence for CBH I was obtained by PCR using Pfu DNA
polymerase and using the cDNA construct pB510-2a as the template.

3. The pPFE vectors and the PCR product were digested using Not I and Xba I.

4. The CBH I PCR product was directionally cloned into the pPFE2 vector using T4 DNA
ligase and transformed into E. coli XL1-Blue.

5. The insertion of the CBH I coding sequence into the pPFE2 vector was confirmed using
PCR, restriction digest analysis, and DNA sequencing through the insertion sites. The
entire coding sequence of the insert was also confirmed by DNA sequencing.

The constructs produced using these two methods were then used to transform A. awamori
and to express rCBH I, as confirmed by western blot analysis of culture supernatant. The rCBH I
expressed in A. awamori tends to be over-glycosylated, as evidenced by the higher molecular
weight observed on western blot analysis. Over-glycosylation of CBH I by A. awamori was
confirmed by digestion of the recombinant protein with endoglycosidases. Following
endoglycosidase H and F digestion, the higher molecular weight form of the protein collapses to
a molecular weight similar to native CBH I.

Example 3. Method for producing PCR site-directed mutations for glycosylation removal
and improved thermal stability.

The QuikChange™ Site-Directed Mutagenesis kit (Stratagene, San Diego, CA) was
used to generate mutants with targeted amino acid substitutions. To introduce these specific
amino acid substitutions, mutagenic primers (between 25 and 45 bases in length) were designed
to contain the desired mutation that would result in the targeted amino acid substitution. Pfu
DNA polymerase was then used to amplify both strands of the double-stranded vector, which
contained the CBH I insertion sequence, with the resultant inclusion of the desired mutation from
the synthetic oligonucleotides. Following temperature cycling, the product was treated with the
restriction endonuclease Dpn I to digest the parental methylated DNA template, and the PCR
product was used to transform Epicurian Coli XL1-Blue supercompetent cells.
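
The primer design step described above can be sketched as follows, under stated assumptions: the mutated codon is placed at the center of the primer with equal flanks on each side, the antisense primer is the exact reverse complement of the sense primer, and melting temperature and GC content are ignored. The template fragment, the GCC alanine codon, and the function names are hypothetical illustrations, not sequences or procedures taken from this disclosure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def design_mutagenic_primers(template, codon_start, new_codon, flank):
    """Build a complementary primer pair carrying a codon substitution.

    codon_start is the 0-based index of the first base of the codon to replace;
    flank is the number of unchanged template bases kept on each side.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three bases long")
    if codon_start - flank < 0 or codon_start + 3 + flank > len(template):
        raise ValueError("flank extends past the end of the template")
    upstream = template[codon_start - flank:codon_start]
    downstream = template[codon_start + 3:codon_start + 3 + flank]
    sense = upstream + new_codon + downstream
    return sense, reverse_complement(sense)


# Hypothetical template fragment with an AAC (Asn) codon at index 15.
toy_template = "GGTTGCACTGCTCAG" + "AAC" + "GGTACTTGTCCTGAC"
sense_primer, antisense_primer = design_mutagenic_primers(
    toy_template, codon_start=15, new_codon="GCC", flank=12
)
print(len(sense_primer), sense_primer, antisense_primer)
```

With 12-base flanks the primers are 27 bases long, which falls within the 25 to 45 base range stated above; a longer flank would be chosen where a higher melting temperature is needed.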

The vector pPFE2/CBHI requires a relatively long PCR reaction (8.2 kb) to make site-specific
changes using the Stratagene QuikChange protocol. The PCR reaction was optimized as
follows using a GeneAmp PCR System 2400 (Perkin Elmer Corporation). The reaction mixture
contained 50 ng of template DNA, 125 ng each of the sense and antisense mutagenic primers,
5 µL of Stratagene 10X cloned Pfu buffer, 200 µM of each dNTP, 5 mM MgCl2 (total final
concentration of MgCl2 is 7 mM), and 2.5 U Pfu Turbo DNA polymerase. The PCR reaction was
carried out for 30 cycles, each consisting of 1 minute denaturation at 96°C, 1 minute annealing
at 69°C and a final extension for 10 min at 75°C, followed by a hold at 4°C. Agarose gel
electrophoresis, ethidium bromide staining, and visualization under UV transillumination were
used to confirm the presence of a PCR product.

PCR products were digested with the restriction enzyme DpnI, to degrade unmutagenized
parental DNA, and transformed into E. coli (Stratagene Epicurian Coli
Supercompetent XL-1 Cells). Ampicillin resistant colonies were picked from LB-amp100 plates
and mutations were confirmed by DNA sequencing.

Template DNA from E. coli XL1-Blue cells transformed with DpnI-treated mutagenized
DNA was prepared for sequencing using the QIAprep-spin plasmid purification mini-prep
procedure (Qiagen, Inc.). The transformed XL1-Blue cells were grown overnight in 5 mL of LB
broth with 100 µg/mL ampicillin selection. Cells were removed by centrifugation and the
plasmid isolated using the protocol outlined in the QIAprep-spin handbook. The concentration of
the template DNA was adjusted to 0.25 µg/µL and shipped along with sequencing
oligonucleotides to the DNA Sequencing Facility at Iowa State University.

After the mutation was confirmed by DNA sequence alignment comparisons using the
software package OMIGA, the DNA was prepared for transformation of A. awamori. The
transformed E. coli XL1-Blue cells were grown overnight on LB plates with 100 µg/mL
ampicillin at 37°C. A single colony was then used to inoculate a 1 L baffled Erlenmeyer flask
that contained 500 mL of LB broth and 100 µg/mL ampicillin. The culture was allowed to grow
for 16 to 20 hours at 37°C with 250 rpm shaking in a NBS reciprocating shaking incubator. The
cells were harvested and the plasmid DNA purified using a Promega maxi-prep purification kit.
The purified maxi-prep DNA was subsequently used to transform A. awamori spheroplasts using
the method described below.

Transformation of Aspergillus awamori with Trichoderma reesei CBH I coding sequence.
Generating Fungal Spheroplasts.

A. awamori spheroplasts were generated from two-day-old cultures of mycelial pellets. A
heavy spore suspension was inoculated into 50 mL of CM broth (5.0 g/L yeast extract; 5.0 g/L
tryptone; 10 g/L glucose; 50 mL/L 20X Clutterbuck's salts, pH 7.5 (adjusted by addition of 2.0 N
NaOH)) and grown at 225 rpm and 28°C in a baffled 250 mL Erlenmeyer flask. The mycelia
were collected by filtration through Miracloth and washed with ~200 mL KCM (0.7 M KCl;
10 mM MOPS pH 5.8). The washed mycelia were transferred to 50 mL of KCM + 500 mg
Novozym 234 in a 50-mL unbaffled flask and incubated O/N at 80 rpm and 30°C. After
digestion, the remaining mycelia were removed by filtration through Miracloth and the
spheroplasts were collected in 50 mL disposable tubes and pelleted at 2500 x g in a swinging
bucket rotor for 15 minutes. The supernatant was discarded and the spheroplasts gently
resuspended in 20 mL 0.7 M KCl by trituration with a 25-mL disposable pipet. The spheroplasts
were pelleted and washed again, then resuspended in 10 mL KC (0.7 M KCl + 50 mM CaCl2).
After being pelleted, the spheroplasts were resuspended in 1.0 mL of KC.

Transformation was carried out using 50 µL of spheroplasts + 5 µL DNA (pPFE1 or
pPFE2, ~200 µg/mL) + 12.5 µL PCM (40% PEG 8000 + 50 mM CaCl2 + 10 mM MOPS pH 5.8).
After incubation for 60 min on ice, 0.5 mL PCM was added and the mixture was incubated for
45 min at room temperature. One milliliter of KCl was added and 370 µL of the mix was added
to 10 mL of molten CMK (CM + 2% agar + 0.7 M KCl) top agar at 55°C. This mixture was
immediately poured onto a 15 mL CM170 plate (CM + 2% agar + 170 µg/mL Zeocin). Negative
transformation controls substituted sterile dH2O for DNA. Positive spheroplast regeneration
controls were performed by plating the transformation mix onto CM plates without Zeocin. The
poured plates were incubated at 28°C in the dark for 2-7 days.

Transformation of Aspergillus awamori with native and modified CBH I coding sequence.

Aspergillus awamori spore stocks were stored at -70°C in 20% glycerol, 10% lactose.
After thawing, 200 µL of spores were inoculated into 50 mL CM broth in each of eight baffled
250 mL Erlenmeyer flasks. The cultures were grown at 28°C, 225 rpm for 48 h. The mycelial
balls were removed by filtration with sterile Miracloth (Calbiochem, San Diego, CA) and
washed thoroughly with sterile KCM. Approximately 10 g of washed mycelia were transferred to
50 mL KCM + 250 mg Novozym 234 in a 250 mL baffled Erlenmeyer flask. The digestion
mixture was incubated at 30°C, 80 rpm for 1-2 h and filtered through Miracloth into 50 mL
conical centrifuge tubes. The spheroplasts were pelleted at 2000 x g for 15 min and resuspended
in 0.7 M KCl by gentle trituration with a 25 mL pipette. This was repeated once. After a third
pelleting, the spheroplasts were resuspended in 10 mL KC, pelleted, and resuspended in 0.5 mL
KC using a wide-bore pipet tip. The washed spheroplasts were transformed by adding 12.5 µL
PCM and 5 µL DNA (~0.5 µg/µL) to 50 µL of spheroplasts in sterile 1.5 mL Eppendorf tubes.
After incubation on ice for 45 minutes, 0.5 mL of room temperature PCM was added to the
transformation mixture and was mixed by trituration with a wide-bore pipet tip. The mixture was
incubated at room temperature for 45 minutes. One milliliter of KC was added and mixed. The
mixture was allocated between four tubes of CM top agar at 55°C, which were each poured over
a 15 mL CM170 plate. The plates were incubated at 28°C for 2-3 days. Subsurface colonies were
partially picked with a sterile wide-bore pipet tip, exposing the remaining part of the colony to air
and promoting rapid sporulation. After sporulation, spores were streaked onto several successive
CM100 or CM300 plates. After a monoculture was established, heavily sporulated plates were
flooded with sterile spore suspension medium (20% glycerol, 10% lactose), the spores were
suspended, and aliquots were frozen at -70°C. Working spore stocks were stored on CM slants in
screw cap tubes at 4°C. Protein production was confirmed and followed by western blot using
anti-CBH I monoclonal antibodies and the Novex Western Breeze anti-mouse chromogenic
detection kit (Novex, San Diego, CA). Insertion of the gene was confirmed by extracting genomic
DNA using the YeaStar Genomic DNA Kit (Zymo Research, Orange, CA) and carrying out PCR
with Pfu Turbo DNA polymerase (Stratagene, La Jolla, CA) and cbh 1 primers.

Production and purification of native rCBH I enzyme from Aspergillus awamori.

For enzyme production, spores were inoculated into 50 mL CM basal starch medium, pH 
7.0, and grown at 32°C, 225 rpm in 250 mL baffled flasks. The cultures were transferred to 1.0 L 
of basal starch medium in 2800 mL Fernbach flasks and grown under similar conditions. For
large-scale enzyme production (>1 mg), these cultures were transferred to 10 L basal starch 
medium in a New Brunswick BioFlo3000 fermenter (10-L working volume) maintained at 20% 
DO, pH 7.0, 25°C, and 300 rpm. The fermentation was harvested by filtration through Miracloth 
after 2-3 days of growth. 

After further clarification by glass fiber filtration, the rCBH I protein was purified by
passing the fermentation broth over four CBinD900 cartridge columns (Novagen, Madison, WI)
connected in parallel using a Pharmacia FPLC System loading at 1.0 mL/min (Amersham
Pharmacia Biotech, Inc., Piscataway, NJ). The cartridges were equilibrated in 20 mM Bis-Tris
pH 6.5 prior to loading and washed with the same buffer after loading. The bound rCBH I was
then eluted with 100% ethylene glycol (3 mL/column) using a syringe. Alternatively, the
supernatant was passed over a para-aminophenyl β-D-cellobioside affinity column, washed with
100 mM acetate buffer, pH 5.0, 1 mM gluconolactone, and eluted in the same buffer containing
10 mM cellobiose. In either method, the eluted rCBH I was concentrated in a Millipore
Ultrafree-15 spin concentrator with a 10 kDa Biomax membrane to <2.0 mL and loaded onto a
Pharmacia SuperDex200 16/60 size-exclusion column. The mobile phase was 20 mM sodium
acetate, 100 mM sodium chloride, and 0.02% sodium azide, pH 5.0, running at 1.0 mL/min. The
eluted protein was concentrated and stored at 4°C. Protein concentrations were determined for each
mutant based upon absorbance at 280 nm and calculated from the extinction coefficient and
molecular weight for each individual protein as determined by primary amino acid sequence
using the ProtParam tool on the ExPASy website (http://www.expasy.ch/tools/protparam.html).
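
The concentration calculation described above follows the Beer-Lambert relation, A = ε·c·l. A minimal sketch is given below, assuming a 1 cm path length; the extinction coefficient, molecular weight, and absorbance used in the example are hypothetical placeholders, not values for CBH I or any of its variants.

```python
def protein_mg_per_ml(a280, ext_coeff, mol_weight, path_cm=1.0):
    """Convert A280 to mg/mL using the Beer-Lambert law (A = e * c * l).

    ext_coeff is the molar extinction coefficient in M^-1 cm^-1 and
    mol_weight is in g/mol (Da); g/L is numerically equal to mg/mL.
    """
    if ext_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    molar_concentration = a280 / (ext_coeff * path_cm)  # mol/L
    return molar_concentration * mol_weight  # g/L, i.e. mg/mL


# Hypothetical inputs for illustration only.
example_mg_per_ml = protein_mg_per_ml(a280=0.8, ext_coeff=80000.0, mol_weight=52000.0)
print(round(example_mg_per_ml, 3))
```

Because each variant has its own sequence-derived extinction coefficient and molecular weight, the same absorbance reading can correspond to slightly different mass concentrations for different mutants.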

Clutterbuck's Salts (20X)
NaNO3 120.0 g
KCl 10.4 g
MgSO4·7H2O 10.4 g
KH2PO4 30.4 g

CM
Yeast Extract - 5 g/L
Tryptone - 5 g/L
Glucose - 10 g/L
Clutterbuck's Salts - 50 mL
Add above to 900 mL dH2O, pH to 7.5, bring to 1000 mL

CM Agar - CM + 20 g/L Agar
CMK - CM Agar + 0.7 M KCl
CM100 - CM + 100 µg/mL Zeocin (Invitrogen, Carlsbad, CA)
CM170 - CM + 170 µg/mL Zeocin, 15 mL/plate
KCl - 0.7 M KCl
KC - 0.7 M KCl + 50 mM CaCl2
KCM - 0.7 M KCl + 10 mM MOPS, pH 5.8
PCM - 40% PEG 8000, 50 mM CaCl2, 10 mM MOPS pH 5.8
(mix 4 mL 50% PEG + 0.5 mL 500 mM CaCl2 stock + 0.5 mL 100 mM MOPS stock)
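
The PCM mixing instruction can be checked against the stated final composition with a simple dilution calculation (stock concentration times stock volume, divided by total volume). The sketch below assumes the volumes are additive; the function name is illustrative.

```python
def final_concentrations(components):
    """Return final concentrations after mixing stocks.

    components maps a label to (stock_concentration, volume_mL); each result is
    in the same unit as the corresponding stock concentration.
    """
    total_volume = sum(volume for _, volume in components.values())
    return {
        label: stock * volume / total_volume
        for label, (stock, volume) in components.items()
    }


# Stocks and volumes as listed in the PCM recipe.
pcm = final_concentrations({
    "PEG 8000 (%)": (50.0, 4.0),
    "CaCl2 (mM)": (500.0, 0.5),
    "MOPS (mM)": (100.0, 0.5),
})
print(pcm)
```

The 5 mL mixture works out to 40% PEG, 50 mM CaCl2, and 10 mM MOPS, in agreement with the composition given for PCM.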

Basal Starch Medium
Casein Hydrolysate, Enzymatic 5 g/L
NH4Cl 5 g/L
Yeast Extract 10 g/L
Tryptone 10 g/L
MgSO4·7H2O 2 g/L
Soluble Starch 50 g/L
Buffer (Bis-Tris-Propane) 50 mM
pH to 7.0 with NaOH

Example 4. Production of Reduced Glycosylation rCBH I: Sites N270A, N45A, and
N384A.

rCBH I/pPFE2 has been optimized using site-directed mutagenesis to achieve expression of
native molecular weight CBH I in A. awamori as follows. The QuikChange SDM
kit (Stratagene, San Diego, CA) was used to make point mutations, switch amino acids, and
delete or insert amino acids in the native cbh 1 gene sequence. The QuikChange SDM
technique was performed using thermotolerant Pfu DNA polymerase, which replicates both
plasmid strands with high fidelity and without displacing the mutant oligonucleotide primers.
The procedure used the polymerase chain reaction (PCR) to modify the cloned cbh 1 DNA. The
basic procedure used a supercoiled double-stranded DNA (dsDNA) vector, with the cbh 1 gene
insert, and two synthetic oligonucleotide primers containing a desired mutation. The
oligonucleotide primers, each complementary to opposite strands of the vector, extend during
temperature cycling by means of the polymerase. On incorporation of the primers, a mutated
plasmid containing the desired nucleotide substitutions was generated. Following temperature
cycling, the PCR product was treated with the DpnI restriction enzyme. DpnI is specific for
methylated and hemi-methylated DNA and thus digests the unmutated parental DNA template,
selecting for the mutation-containing, newly synthesized DNA. The nicked vector DNA,
containing the desired mutations, was then transformed into E. coli. The small amount of
template DNA required to perform this reaction and the high fidelity of the Pfu DNA
polymerase contribute to the high mutation efficiency and minimize the potential for the
introduction of random mutations. Three glycosylation-site amino acids on the protein surface were
targeted for substitution of an alanine (A) residue in place of asparagine (N). Single-site
substitutions were successfully completed in the cbh 1 coding sequence at sites N45, N270, and
N384 of SEQ ID NO: 4 by site-directed mutagenesis and confirmed by DNA sequencing.
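
As general biochemical background that is not stated in this disclosure, N-linked glycosylation is commonly associated with the sequence motif N-X-S/T, where X is any residue other than proline. The sketch below scans a protein sequence for that motif and shows how an N-to-A substitution removes a site. The toy sequence and function name are hypothetical; this is not the procedure used to choose N45, N270, and N384.

```python
import re

# Lookahead so that overlapping motifs are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return 1-based positions of asparagines in N-X-S/T motifs (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical toy sequence; NOT the CBH I sequence.
toy_protein = "MKNGTAQNLSWNPTG"
sites_before = find_sequons(toy_protein)

# Replace the asparagine at position 3 with alanine (an "N3A" substitution).
toy_variant = toy_protein[:2] + "A" + toy_protein[3:]
sites_after = find_sequons(toy_variant)
print(sites_before, sites_after)
```

A motif match only indicates a potential site; whether a given asparagine is actually glycosylated in a particular host has to be established experimentally, as was done here by western blot.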

Double and triple combinations of this substitution have also been completed in the cbh 1
coding sequence at sites N45, N270, and N384 by site-directed mutagenesis and confirmed by
DNA sequencing. These double and triple site constructs also yield rCBH I enzymes with
reduced glycosylation and, presumably, native activity.
Table 1.

Enzyme                  Host          MW (kDa)   Km (µmol pNPL)   Vmax (µmol pNP/min/mg protein)
T. reesei               None          57.8       1.94             0.746
rCBHI wt cDNA           A. awamori    63.3       2.14             0.668
rCBHI wt genomic        A. awamori    63.3       --               --
rCBHI N270A             A. awamori    61.7       2.25             0.489
rCBHI N384A             A. awamori    61.3
rCBHI wt genomic (G)    A. awamori    63.3
rCBH I N45A             A. awamori    58.3
rCBHI N270/45A          A. awamori    58.3
rCBHI N384/270A         A. awamori    58.8


As shown in Table 1, Western blot analysis of the supernatant obtained from a culture of the
single glycosylation site mutant CBHI-N270A expressed in A. awamori demonstrated a
decrease in the amount of glycosylation of the protein, seen as a lower molecular weight
(61.7 kDa) compared to that of the wild type cDNA (63.3 kDa) and the wild type genomic
DNA (63.3 kDa) products. These results demonstrate a reduction in the level of glycosylation in
the reduced glycosylation mutant CBHI-N270A via expression in A. awamori. The Table also
shows that the CBHI-N270A enzyme nearly retained its native enzymatic activity when
assayed using the pNPL substrate. The variants CBHI-N45A and CBHI-N384A also demonstrate a
reduction in the amount of glycosylation, and native activity, when expressed from the
heterologous host A. awamori, and when combined in the double mutations CBHI-N270/45A and
CBHI-N270/384A reduce the level of glycosylation further.
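
The molecular weight comparison drawn from Table 1 can be restated as simple differences from the wild type recombinant value. The sketch below uses only the molecular weights listed in the table; the variable names are illustrative, and the differences are rounded to one decimal place to match the precision of the table.

```python
# Apparent molecular weights (kDa) as listed in Table 1 for A. awamori products.
WILD_TYPE_KDA = 63.3
VARIANT_KDA = {
    "N270A": 61.7,
    "N384A": 61.3,
    "N45A": 58.3,
    "N270/45A": 58.3,
    "N384/270A": 58.8,
}

# Decrease in apparent molecular weight relative to wild type rCBH I.
mw_decrease_kda = {
    name: round(WILD_TYPE_KDA - mw, 1) for name, mw in VARIANT_KDA.items()
}
print(mw_decrease_kda)
```

On these figures the single N270A substitution lowers the apparent molecular weight by 1.6 kDa, and the N45A-containing constructs approach the 57.8 kDa listed for the native T. reesei enzyme.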

Example 5. Amino Acid Mutations Targeted To Improve Thermal Tolerance Of CBH I.
Helix Capping Mutants.

All α-helices display dipole moments, i.e., positive at the N-terminus and negative at the
C-terminus. Compensation for such dipole moments (capping) has been observed in a number of
protein structures and has been shown to improve protein stability. For example, the
introduction of a negatively charged amino acid at the N-terminus and a positively charged
amino acid at the C-terminus of an α-helix increased the thermostability of T4 lysozyme and hen
lysozyme, via an electrostatic interaction with the "helix dipole." Five amino acid sites were
identified for helix capping (see Table 5).

Peptide Strain Removal Mutants.

A small fraction of residues adopt unfavorable torsion (phi-psi) angles.
It has been shown that mutation of such residues to Gly increased protein stability by as much as
4 kcal/mol. One amino acid site was selected for peptide strain removal (see Table 3).

Helix Propensity Mutants.

Two amino acid sites were selected for helix propensity improvement.

Disulfide Bridge Mutants.

Disulfide bonds introduced between amino acid positions 9 and 164 and between 21 and
142 in phage T4 lysozyme have been shown to significantly increase the stability of the
respective enzymes toward thermal denaturation. The engineered disulfide bridge between
residues 197 and 370 of CBH I should span the active site cleft and enhance its thermostability.
The active site of CBH I is in a tunnel. The roof over the tunnel appears to be fairly mobile (high
temperature-factors). At an elevated temperature the mobility of the tunnel is too great to
position all the active site residues. The disulfide linkage should stabilize the roof of the tunnel,
making the enzyme a consistent exocellulase even at a high temperature. Two amino acid sites
were identified for new disulfide bridge generation.

Deletion Mutants. 

Thermostable proteins have shorter loops that connect their structural elements than 
typical proteins. Our sequence alignment of CBH I, with its close homologs, suggests that the 
following residues may be deleted without significantly affecting its function. These loops 
exhibited high mobility as well. Three loops were identified, but these modifications were 
considered high risk (buried hydrophobic regions may be exposed to solvent upon deletion of a 
natural loop) and will be saved for future work. 

Proline Replacement Mutants. 

The unique structure of proline dictates that fewer degrees of freedom are allowed around 
the alpha carbon than for most other amino acids. The result of this structure is that peptides 
tend to lose flexibility in regions rich in proline. In order to assess possible sites for 
replacement of existing amino acids with proline, the phi/psi angles of candidate amino acid 
sites must conform to those consistent with proline. Each new site must also be evaluated for 
allowable side chain interactions and assurance that interactions with substrate are not altered. 
Seventeen amino acid sites were identified for proline replacement (see Table 2). 
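
The phi-angle part of this screening step can be sketched as a simple filter. The sketch below is illustrative only: the site names and angles are hypothetical, and the window of about -63 degrees plus or minus 15 degrees is an assumed, commonly cited range for proline phi angles rather than a value taken from this disclosure. The psi angle, side chain interactions and substrate contacts would still need to be evaluated separately.

```python
def proline_compatible(phi_deg, center=-63.0, tolerance=15.0):
    """Return True if a backbone phi angle lies inside the assumed proline window."""
    return abs(phi_deg - center) <= tolerance

# Hypothetical candidate sites mapped to hypothetical phi angles in degrees.
candidate_sites = {"site_A": -65.0, "site_B": -120.0, "site_C": -58.0}

# Keep only the sites whose phi angle is compatible with proline.
selected = [name for name, phi in candidate_sites.items() if proline_compatible(phi)]
print(selected)
```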

Example 6. Nucleic acid sequence of a variant exoglucanase. 

The present example demonstrates the utility of the present invention for providing a 
nucleic acid molecule having a nucleic acid sequence that has a sequence 5'- 
GGCGGAAACCCGCCTGGCACCACC-3' (SEQ ID NO: 3). The identified nucleic acid 
sequence presents a novel linker region nucleic acid sequence that differs from the previously 
reported nucleic acid sequence by the addition of one (1) codon, and the alteration of an adjacent 
codon, both encoding a proline (see Figure 4). The invention in some aspects thus provides a 
nucleic acid molecule encoding a cellobiohydrolase [having a nucleic acid sequence] that 
comprises a linker region of about [20 to 60 nucleotides] 6 to 20 amino acids in length as 
identified here. 
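
As an illustration, the sequence given as SEQ ID NO: 3 can be translated with the standard genetic code. This is a minimal sketch that assumes the reading frame begins at the first nucleotide of the sequence as written; the function and variable names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Translate a DNA string whose length is a multiple of three."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))

linker_fragment = "GGCGGAAACCCGCCTGGCACCACC"  # SEQ ID NO: 3
print(translate(linker_fragment))  # prints GGNPPGTT
```

In the assumed frame the two adjacent CCG and CCT codons both encode proline.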



Table 2. Proline mutations to improve thermal tolerance. 

Mutation                                          Native sequence and mutagenic oligonucleotide

SEQ ID NO: 10  S8P - native sense strand          5'-GCACTCTCCAATCGGAGACTCACCCG-3'
SEQ ID NO: 11  Mutagenic sense strand             5'-GCACTCTCCAACCGGAGACTCACCCG-3'
SEQ ID NO: 12  Mutagenic anti-sense strand        5'-CGGGTGAGTCTCCGGTTGGAGAGTGC-3'

SEQ ID NO: 13  N27P - native sense strand         5'-GGCACGTGCACTCAACAGACAGGCTCCG-3'
SEQ ID NO: 14  Mutagenic sense strand             5'-GGCACGTGCACTCCACAGACAGGCTCCG-3'
SEQ ID NO: 15  Mutagenic anti-sense strand        5'-CGGAGCCTGTCTGTGGAGTGCACGTGCC-3'

SEQ ID NO: 16  A43P - native sense strand         5'-GGCGCTGGACTCACGCTACGAACAGCAGCACG-3'
SEQ ID NO: 17  Mutagenic sense strand             5'-GGCGCTGGACTCACCCTACGAACAGCAGCACG-3'
SEQ ID NO: 18  Mutagenic anti-sense strand        5'-CGTGCTGCTGTTCGTAGGGTGAGTCCAGCGCC-3'

SEQ ID NO: 19  G75P - native sense strand         5'-GCTGTCTGGACGGTGCCGCCTACGCG-3'
SEQ ID NO: 20  Mutagenic sense strand             5'-GCTGTCTGGACCCTGCCGCCTACGCG-3'
SEQ ID NO: 21  Mutagenic anti-sense strand        5'-CGCGTAGGCGGCAGGGTCCAGACAGC-3'

SEQ ID NO: 22  G94P - native sense strand         5'-GCCTCTCCATTGGCTTTGTCACCC-3'
SEQ ID NO: 23  Mutagenic sense strand             5'-GCCTCTCCATTCCCTTTGTCACCC-3'
SEQ ID NO: 24  Mutagenic anti-sense strand        5'-GGGTGACAAAGGGAATGGAGAGGC-3'

SEQ ID NO: 25  E190P - native sense strand        5'-GGCCAACGTTGAGGGCTGGGAGCC-3'
SEQ ID NO: 26  Mutagenic sense strand             5'-GGCCAACGTTCCGGGCTGGGAGCC-3'
SEQ ID NO: 27  Mutagenic anti-sense strand        5'-GGCTCCCAGCCCGGAACGTTGGCC-3'

SEQ ID NO: 28  S195P - native sense strand        5'-GGCTGGGAGCCGTCATCCAACAACGCG-3'
SEQ ID NO: 29  Mutagenic sense strand             5'-GGCTGGGAGCCGCCATCCAACAACGCG-3'
SEQ ID NO: 30  Mutagenic anti-sense strand        5'-CGCGTTGTTGGATGGCGGCTCCCAGCC-3'

SEQ ID NO: 31  K287P - native sense strand        5'-CGATACCACCAAGAAATTGACCGTTGTCACCC-3'
SEQ ID NO: 32  Mutagenic sense strand             5'-CGATACCACCAAGCCATTGACCGTTGTCACCC-3'
SEQ ID NO: 33  Mutagenic anti-sense strand        5'-GGGTGACAACGGTCAATGGCTTGGTGGTATCG-3'

SEQ ID NO: 34  A299P - native sense strand        5'-CGAGACGTCGGGTGCCATCAACCGATAC-3'
SEQ ID NO: 35  Mutagenic sense strand             5'-CGAGACGTCGGGTCCCATCAACCGATAC-3'
SEQ ID NO: 36  Mutagenic anti-sense strand        5'-GTATCGGTTGATGGGACCCGACGTCTCG-3'

SEQ ID NO: 37  Q312P/N315P - native sense strand  5'-GGCGTCACTTTCCAGCAGCCCAACGCCGAGCTTGG-3'
SEQ ID NO: 38  Mutagenic sense strand             5'-GGCGTCACTTTCCCGCAGCCCCCCGCCGAGCTTGG-3'
SEQ ID NO: 39  Mutagenic anti-sense strand        5'-CCAAGCTCGGCGGGGGGCTGCGGGAAAGTGACGCC-3'

SEQ ID NO: 40  G359P - native sense strand        5'-GGCTACCTCTGGCGGCATGGTTCTGG-3'
SEQ ID NO: 41  Mutagenic sense strand             5'-GGCTACCTCTCCCGGCATGGTTCTGG-3'
SEQ ID NO: 42  Mutagenic anti-sense strand        5'-CCAGAACCATGCCGGGAGAGGTAGCC-3'

SEQ ID NO: 43  S398P/S401P - native sense strand  5'-GCGGAAGCTGCTCCACCAGCTCCGGTGTCCCTGC-3'
SEQ ID NO: 44  Mutagenic sense strand             5'-GCGGAAGCTGCCCCACCAGCCCCGGTGTCCCTGC-3'
SEQ ID NO: 45  Mutagenic anti-sense strand        5'-GCAGGGACACCGGGGCTGGTGGGGCAGCTTCCGC-3'

SEQ ID NO: 46  A414P - native sense strand        5'-GTCTCCCAACGCCAAGGTCACC-3'
SEQ ID NO: 47  Mutagenic sense strand             5'-GTCTCCCAACCCCAAGGTCACC-3'
SEQ ID NO: 48  Mutagenic anti-sense strand        5'-GGTGACCTTGGGGTTGGGAGAC-3'

SEQ ID NO: 49  N431P/S433P - native sense strand  5'-GGCAGCACCGGCAACCCTAGCGGCGGCAACCC-3'
SEQ ID NO: 50  Mutagenic sense strand             5'-GGCAGCACCGGCCCCCCTCCCGGCGGCAACCC-3'
SEQ ID NO: 51  Mutagenic anti-sense strand        5'-GGGTTGCCGCCGGGAGGGGGGCCGGTGCTGCC-3'

Table 3. Mutation to remove peptide strain. 

Mutation site                                     Native sequence and mutagenic oligonucleotide

SEQ ID NO: 52  S99G - native sense strand         5'-GGCTTTGTCACCCAGTCTGCGCAGAAGAACGTTGGC-3'
SEQ ID NO: 53  Mutagenic sense strand             5'-GGCTTTGTCACCCAGGGTGCGCAGAAGAACGTTGGC-3'
SEQ ID NO: 54  Mutagenic anti-sense strand        5'-GCCAACGTTCTTCTGCGCACCCTGGGTGACAAAGCC-3'


Table 3b. Y245G analogs to remove product inhibition. 

Mutation site                                     Native sequence and mutagenic oligonucleotide

SEQ ID NO: 55  R251A - native sense strand        5'-CCGATAACAGATATGGCGGC-3'
SEQ ID NO: 56  Mutagenic sense strand             5'-CCGATAACGCCTATGGCGGC-3'
SEQ ID NO: 57  Mutagenic anti-sense strand        5'-GCCGCCATAGGCGTTATCGG-3'

SEQ ID NO: 58  R394A - native sense strand        5'-CCCGGTGCCGTGCGCGGAAGCTGCTCCACC-3'
SEQ ID NO: 59  Mutagenic sense strand             5'-CCCGGTGCCGTGGCCGGAAGCTGCTCCACC-3'
SEQ ID NO: 60  Mutagenic anti-sense strand        5'-GGTGGAGCAGCTTCCGGCCACGGCACCGGG-3'

SEQ ID NO: 61  F338A - native sense strand        5'-GCTGAGGAGGCAGAATTCGGCGGATCCTCTTTCTC-3'
SEQ ID NO: 62  Mutagenic sense strand             5'-GCTGAGGAGGCAGAAGCCGGCGGATCCTCTTTCTC-3'
SEQ ID NO: 63  Mutagenic anti-sense strand        5'-GAGAAAGAGGATCCGCCGGCTTCTGCCTCCTCAGC-3'

SEQ ID NO: 64  R267A - native sense strand        5'-GGAACCCATACCGCCTGGGCAACACCAGC-3'
SEQ ID NO: 65  Mutagenic sense strand             5'-GGAACCCATACGCCCTGGGCAACACCAGC-3'
SEQ ID NO: 66  Mutagenic anti-sense strand        5'-GCTGGTGTTGCCCAGGGCGTATGGGTTCC-3'

SEQ ID NO: 67  E385A - native sense strand        5'-CCTACCCGACAAACGAGACCTCCTCCACACCCGG-3'
SEQ ID NO: 68  Mutagenic sense strand             5'-CCTACCCGACAAACGCCACCTCCTCCACACCCGG-3'
SEQ ID NO: 69  Mutagenic anti-sense strand        5'-CCGGGTGTGGAGGAGGTGGCGTTTGTCGGGTAGG-3'

Table 4. N to A mutations to remove glycosylation. 

Mutant                                            Native sequence and mutagenic oligonucleotide

SEQ ID NO: 70  N45A - native sense strand         5'-GGACTCACGCTACGAACAGCAGCACGAACTGC-3'
SEQ ID NO: 71  Mutagenic sense strand             5'-GGACTCACGCTACGGCCAGCAGCACGAACTGC-3'
SEQ ID NO: 72  Mutagenic anti-sense strand        5'-GCAGTTCGTGCTGCTGGCCGTAGCGTGAGTCC-3'

SEQ ID NO: 73  N270A - native sense strand        5'-CCCATACCGCCTGGGCAACACCAGCTTCTACGGCCC-3'
SEQ ID NO: 74  Mutagenic sense strand             5'-CCCATACCGCCTGGGCGCCACCAGCTTCTACGGCCC-3'
SEQ ID NO: 75  Mutagenic anti-sense strand        5'-GGGCCGTAGAAGCTGGTGGCGCCCAGGCGGTATGGG-3'

SEQ ID NO: 76  N384A - native sense strand        5'-GGACTCCACCTACCCGACAAACGAGACCTCCTCCACACCCG-3'
SEQ ID NO: 77  Mutagenic sense strand             5'-GGACTCCACCTACCCGACAGCCGAGACCTCCTCCACACCCG-3'
SEQ ID NO: 78  Mutagenic anti-sense strand        5'-CGGGTGTGGAGGAGGTCTCGGCTGTCGGGTAGGTGGAGTCC-3'


Table 5. Helix capping mutations to improve thermal tolerance. 

Mutant                                            Native sequence and mutagenic oligonucleotide

SEQ ID NO: 79  E337R - native sense strand        5'-GCTGAGGAGGCAGAATTCGGCGG-3'
SEQ ID NO: 80  Mutagenic sense strand             5'-GCTGAGGAGGCACGCTTCGGCGG-3'
SEQ ID NO: 81  Mutagenic anti-sense strand        5'-CCGCCGAAGCGTGCCTCCTCAGC-3'

SEQ ID NO: 82  N327D - native sense strand        5'-GGCAACGAGCTCAACGATGATTACTGC-3'
SEQ ID NO: 83  Mutagenic sense strand             5'-GGCAACGAGCTCGACGATGATTACTGC-3'
SEQ ID NO: 84  Mutagenic anti-sense strand        5'-GCAGTAATCATCGTCGAGCTCGTTGCC-3'

SEQ ID NO: 85  A405D - native sense strand        5'-CCGGTGTCCCTGCTCAGGTCGAATCTCAGTCTCCC-3'
SEQ ID NO: 86  Mutagenic sense strand             5'-CCGGTGTCCCTGATCAGGTCGAATCTCAGTCTCCC-3'
SEQ ID NO: 87  Mutagenic anti-sense strand        5'-GGGAGACTGAGATTCGACCTGATCAGGGACACCGG-3'

SEQ ID NO: 88  Q410R - native sense strand        5'-GCTCAGGTCGAATCTCAGTCTCCCAACGCC-3'
SEQ ID NO: 89  Mutagenic sense strand             5'-GCTCAGGTCGAATCTCGCTCTCCCAACGCC-3'
SEQ ID NO: 90  Mutagenic anti-sense strand        5'-GGCGTTGGGAGAGCGAGATTCGACCTGAGC-3'

SEQ ID NO: 91  N64D - native sense strand         5'-CCCTATGTCCTGACAACGAGACCTGCGCG-3'
SEQ ID NO: 92  Mutagenic sense strand             5'-CCCTATGTCCTGACGACGAGACCTGCGCG-3'
SEQ ID NO: 93  Mutagenic anti-sense strand        5'-CGCGCAGGTCTCGTCGTCAGGACATAGGG-3'

SEQ ID NO: 94  N64D - native sense strand         5'-GCTCGACCCTATGTCCTGACAACGAGACCTGCGCGAAGAACTGC-3'
SEQ ID NO: 95  Mutagenic sense strand             5'-GCTCGACCCTATGTCCTGACGACGAGACCTGCGCGAAGAACTGC-3'
SEQ ID NO: 96  Mutagenic anti-sense strand        5'-GCAGTTCTTCGCGCAGGTCTCGTCGTCAGGACATAGGGTCGAGC-3'





Legend for Tables 2, 3, 3b, 4 and 5. Amino acid mutation sites are listed in the left column. The 
first letter in the designation is the amino acid of the native protein based upon the IUPAC 
convention for one-letter codes for amino acids. The number represents the amino acid location 
as designated from the start of the mature protein (excluding the signal peptide, i.e. QSA...). The 
letter designation after the number represents the amino acid that will occur as a result of the 
mutation. For example, N64D represents the asparagine at site 64 changed to an aspartic acid. 
The native sense strand sequence for each site is listed in the right column with the 
oligonucleotide primers (sense and anti-sense) used to obtain the desired mutation below the 
native sequence in each case. In addition, the codon for the targeted amino acid is bolded and the 
nucleotide substitutions in the mutagenic primers underlined. In some cases only one nucleotide 
substitution was required to make the desired change, and in others 2 or 3 substitutions were 
required. In a few cases, double mutations were made with a single mutagenic oligonucleotide. 
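
The relationships described in the legend can be checked with a short script. The sketch below is illustrative only and uses the first N64D primer set of Table 5 (SEQ ID NOS: 91 to 93); the function names are assumptions. It confirms that the anti-sense primer is the reverse complement of the mutagenic sense primer and lists the nucleotide substitutions relative to the native sense strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def substitutions(native, mutant):
    """List (0-based index, native base, mutant base) for every mismatch."""
    if len(native) != len(mutant):
        raise ValueError("sequences must have the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(native, mutant)) if a != b]

native_sense = "CCCTATGTCCTGACAACGAGACCTGCGCG"     # SEQ ID NO: 91
mutagenic_sense = "CCCTATGTCCTGACGACGAGACCTGCGCG"  # SEQ ID NO: 92
mutagenic_antisense = "CGCGCAGGTCTCGTCGTCAGGACATAGGG"  # SEQ ID NO: 93

print(reverse_complement(mutagenic_sense) == mutagenic_antisense)
print(substitutions(native_sense, mutagenic_sense))
```

For this primer set a single A to G substitution converts the AAC (asparagine) codon to GAC (aspartic acid), which is the one-substitution case mentioned in the legend.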
Claims 

1 through 5. (Cancelled) 

6. (Previously presented) A nucleic acid molecule having a nucleic acid sequence 
encoding a variant cellobiohydrolase that is mutated with respect to a wild-type 
cellobiohydrolase of SEQ ID NO: 5, the mutation providing means for improving functionality 
of the variant cellobiohydrolase with respect to the wild-type cellobiohydrolase. 

7. (Currently amended) The nucleic acid molecule of claim 6 wherein the means for 
improving is selected from the group consisting of: 

(a) proline substituted at a position selected from the group consisting of position 8, 27, 43, 
75, 94, 190, 195, 287, 299, 312, 315, 359, 398, 401, 414, 431, 433, and any combination 
thereof; 

(b) a helix-capping mutation defined as an arginine or aspartic acid residue is substituted at a 
position selected from the group consisting of position 64, 337, 327, 405, 410 and any 
combination thereof; 

(c) substitution of glycine at position 99; 

[(d) a deletion from the group consisting of position 99-101, position 278-279, and position 
387, and any combination thereof;] 

(d) [(e)] [a] substitution of cysteine at positions 197 and 370; 

(e) [(f)] substitution of a non-glycosyl accepting amino acid residue in place of an N- 
glycosylation site amino acid residue at a position selected from the group consisting of 
position 45, 270, 384 and any combination thereof, 

(f) [(g)] alanine substitution at a position selected from the group consisting of position 45, 270, 
384 and any combination thereof; and 

[(h) alanine at a position selected from the group consisting of position 252, 291, 338, 267, 
385, and any combination thereof; and] 


Application Serial # 10/031,496 



Appendix B 

Patent 

Attorney Docket # NREL 99-45 
(g) [(i)] any combination of the mutations of (a), (b), (c), (d), (e), (f), [(g),] wherein the 
positional reference is within an [the] amino acid sequence of the wild-type [encoding a 
native] cellobiohydrolase [I] of SEQ ID NO: 5. 

8. (Previously presented) The nucleic acid molecule of claim 7 wherein the means for 
improving comprises the proline substituted at a position selected from the group consisting of 
position 8, 27, 43, 75, 94, 190, 195, 287, 299, 312, 315, 359, 398, 401, 414, 431, 433, and any 
combination thereof. 

9. (Previously presented) The nucleic acid molecule of claim 7 wherein the means for 
improving comprises the helix-capping mutation defined as an arginine or aspartic acid residue is 
substituted at a position selected from the group consisting of position 64, 337, 327, 405, 410 and 
any combination thereof. 

10. (Currently amended) The nucleic acid molecule of claim 7 wherein the means for 
improving comprises substitution of the glycine at position 99. 

11. (Currently amended) A method for mutating a nucleic acid encoding a wild type 
cellobiohydrolase of SEQ ID NO: 5, the method comprising: 

mutating the wild type cellobiohydrolase with a mutation selected from the group consisting of: 

(a) proline substituted at a position selected from the group consisting of position 8, 27, 43, 
75, 94, 190, 195, 287, 299, 312, 315, 359, 398, 401, 414, 431, 433, and any combination 
thereof; 

(b) a helix-capping mutation defined as an arginine or aspartic acid residue is substituted at a 
position selected from the group consisting of position 64, 337, 327, 405, 410 and any 
combination thereof; 

(c) substitution of glycine at position 99; 

[(d) a deletion from the group consisting of position 99-101, position 278-279, and position 
387, and any combination thereof;] 

(e) [(f)] substitution of a non-glycosyl accepting amino acid residue in place of an N- 
glycosylation site amino acid residue at a position selected from the group consisting of 
position 45, 270, 384 and any combination thereof, 

(f) [(g)] alanine substitution at a position selected from the group consisting of position 45, 270, 
384 and any combination thereof; and 

[(h) alanine at a position selected from the group consisting of position 252, 294, 338, 267, 
385, and any combination thereof; and] 

(g) [(i)] any combination of the mutations of (a), (b), (c), (d), (e), (f), [(h),] wherein the 
positional reference is within an [the] amino acid sequence of the wild-type [encoding a 
native] cellobiohydrolase [I] of SEQ ID NO: 5. 

12. (Currently amended) The method of claim 11, wherein the mutation comprises 
substitution of a [the] non-glycosyl accepting amino acid residue in place of an N-glycosylation 
site amino acid residue at a position selected from the group consisting of position 45, 270, 384 
and any combination thereof. 

13. (Previously presented) The method of claim 11, wherein the step of mutating 
comprises site-directed mutagenesis. 

14. (Currently amended) The method of claim 11, further comprising a step of shortening a 
linker region of the wild-type cellobiohydrolase [sequence being shortened] with respect to wild- 
type linker region SEQ ID NO: 2 to provide [comprises] a linker region [sequence] having a length 
of from about 6 amino acids [20 nucleotides] to about 17 amino acids [50 nucleotides] located 
between a catalytic domain and a cellulose binding domain (CBD) of SEQ ID NO: 5. 

15. (Currently amended) An exoglucanase, comprising the sequence change encoded by 
SEQ ID NO: 71 [20]. 

16. (Currently amended) An exoglucanase, comprising the sequence change encoded by 
SEQ ID NO: 74 [34]. 
17. (Cancelled). 

18. (Cancelled). 

19. (Cancelled). 

20. (Currently amended) The nucleic acid molecule of claim 7 wherein the means for 
enhancing thermostability comprises substitution of a [the] cysteine at positions 197 and 370. 

21. (Currently amended) The nucleic acid molecule of claim 7 wherein the means for 
enhancing thermostability comprises substitution of a [the] non-glycosyl accepting amino acid 
residue in place of an N-glycosylation site amino acid residue at a position selected from the 
group consisting of position 45, 270, 384 and any combination thereof. 

22. (Currently amended) The nucleic acid molecule of claim 7 wherein the means for 
enhancing thermostability comprises substitution of an [the] alanine at a position selected from the 
group consisting of position 45, 270, 384 and any combination thereof. 

23. (Cancelled). 

24. (Previously presented) The nucleic acid molecule of claim 7 wherein the means for 
improving comprises means for enhancing thermostability. 

25. (Currently amended) The nucleic acid molecule of claim 6, wherein the variant 
cellobiohydrolase comprises a linker region [sequence] having a length of from about 6 amino 
acids [20 nucleotides] to about 17 amino acids [50 nucleotides] located between a catalytic domain 
and a cellulose binding domain (CBD)[, the linker region sequence being shortened with respect 
to SEQ ID NO: 2]. 

26. (Currently amended) A nucleic acid molecule having a nucleic acid sequence encoding a 
variant cellobiohydrolase that is mutated with respect to a wild-type cellobiohydrolase of SEQ 
ID NO: 5, the mutation selected from the group consisting of: 
(a) proline substituted at a position selected from the group consisting of position 8, 27, 43, 
75, 94, 190, 195, 287, 299, 312, 315, 359, 398, 401, 414, 431, 433, and any combination 
thereof; 

(b) a helix-capping mutation defined as an arginine or aspartic acid residue is substituted at a 
position selected from the group consisting of position 64, 337, 327, 405, 410 and any 
combination thereof; 

(c) substitution of glycine at position 99; 

[(d) a deletion from the group consisting of position 99-101, position 278-279, and position 
387, and any combination thereof;] 

(d) [(e)] [a] substitution of cysteine at positions 197 and 370; 

(e) [(f)] substitution of a non-glycosyl accepting amino acid residue in place of an N- 
glycosylation site amino acid residue at a position selected from the group consisting of 
position 45, 270, 384 and any combination thereof, 

(f) [(g)] alanine substitution at a position selected from the group consisting of position 45, 270, 
384 and any combination thereof; and 

[(h) alanine at a position selected from the group consisting of position 252, 294, 338, 267, 
385, and any combination thereof; and] 

(g) [(i)] any combination of the mutations of (a), (b), (c), (d), (e), (f), wherein the 
positional reference is within an [the] amino acid sequence of the wild-type [encoding a native] 
cellobiohydrolase [I] of SEQ ID NO: 5. 

27. (New) An exoglucanase, comprising the sequence change encoded by SEQ ID NO: 77. 

28. (New) An exoglucanase composition, comprising a combination of exoglucanases 
selected from the group consisting of exoglucanases defined by claims 15, 16 and 27. 
ABSTRACT 

[A nucleic acid molecule having a nucleic acid sequence that encodes a linker region of 
exoglucanase, said nucleic acid sequence comprising the nucleotide sequence 
5'-GGCGGAAACCCGCCTGGCACCACC-3' (SEQ ID NO: 3).] The disclosure provides a method 
for preparing an active exoglucanase in a heterologous host of eukaryotic origin. The method 
includes mutagenesis to reduce glycosylation of the exoglucanase when expressed in a 
heterologous host. Further disclosed is a method to produce a variant cellobiohydrolase that is 
stable at high temperature through mutagenesis. 